Improve group by hash performance: avoid group-key/-state clones for hash-groupby #4651
Conversation
Looks great to me. Thank you @crepererum
cc @Dandandan and @tustvold
accumulators
    .group_states
    .iter()
    .map(|group_state| group_state.group_by_values[i].clone()),
👍
.into_iter()
.map(|group_state| {
    (
        VecDeque::from(group_state.group_by_values.to_vec()),
🤔 maybe we could always use a VecDeque and avoid this copy too 🤔
It's not a copy. Due to move semantics, the Rust standard library should just reuse the pointer to the allocated backing array.
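For illustration, a minimal standalone sketch (not part of the PR) of why this conversion is a move rather than a copy. On recent Rust versions the standard library documents VecDeque::from(Vec<T>) as O(1) and non-reallocating; the pointer check below is an observation of that behavior, not a stable API guarantee.

use std::collections::VecDeque;

fn main() {
    let v: Vec<String> = vec!["a".into(), "b".into(), "c".into()];
    let buf_ptr = v.as_ptr();

    // `v` is moved, not copied: the deque takes over the Vec's buffer
    // instead of allocating a new one and cloning the strings.
    let dq = VecDeque::from(v);

    // A freshly converted deque is contiguous, so its front slice
    // should start at the same buffer the Vec owned.
    assert_eq!(dq.as_slices().0.as_ptr(), buf_ptr);
}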
Nice @crepererum!
Benchmark runs are scheduled for baseline = 4ec559d and contender = 9667887. 9667887 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
-
Rationale for this change
Just found a bunch of CPU cycles spent in clone calls for large aggregations that involve strings. It seems we don't need to clone that much data.
What changes are included in this PR?
A bit more data moving instead of cloning.
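As a rough sketch of the idea (the GroupState below is a simplified, hypothetical stand-in, not the actual DataFusion struct): consuming the group states by value lets the key values be moved into the output instead of cloned.

// Hypothetical GroupState, for illustration only; the real struct
// holds more fields (accumulator state, indices, ...).
struct GroupState {
    group_by_values: Box<[String]>,
}

// Before: iterate by reference and clone every key value.
fn keys_cloned(states: &[GroupState]) -> Vec<Vec<String>> {
    states
        .iter()
        .map(|s| s.group_by_values.to_vec()) // copies every String
        .collect()
}

// After: consume the states so the boxed slices (and the string
// allocations inside them) are moved, not duplicated.
fn keys_moved(states: Vec<GroupState>) -> Vec<Vec<String>> {
    states
        .into_iter()
        .map(|s| s.group_by_values.into_vec()) // reuses the allocation
        .collect()
}

fn make_states() -> Vec<GroupState> {
    vec![GroupState {
        group_by_values: vec!["k1".to_string()].into_boxed_slice(),
    }]
}

fn main() {
    // Same result either way; the moved variant just skips the clones.
    let cloned = keys_cloned(&make_states());
    let moved = keys_moved(make_states());
    assert_eq!(cloned, moved);
}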
Are these changes tested?
Are there any user-facing changes?
Faster group-bys.